In a typical situation involving a television or other system presenting a video with accompanying audio, the audio is a single audio stream heard by all viewing users. In some examples, such as with DVD movies, the audio track played along with the video is selectable, enabling selection of tracks with, for example, foreign language overdubs or expanded/enhanced audio channeling. However, even in these cases the audio stream presented to the viewers with the video content is one-dimensional in that all sounds are meshed together into the stream regardless of the number and nature of activities occurring in the scene. Such a stream does not allow the audio delivered to the users to be tailored depending on where in the scene the users have focused their attention. To the extent that a particular sound is to be heard by the users, the sound is included in the single audio stream that is played with the video. This results in a generalized audio stream in which audio features from particular portions of the scene may be difficult to discern, and it requires the content creator to make selections about the dominant sounds of the stream regardless of where the interests of the individual users lie.